Results 1 - 18 of 18
1.
Commun Biol ; 6(1): 169, 2023 02 15.
Article in English | MEDLINE | ID: mdl-36792689

ABSTRACT

Identifying network architecture from observed neural activities is crucial in neuroscience studies. A key requirement is knowledge of the statistical input-output relation of single neurons in vivo. By utilizing an exact analytical solution of the spike-timing for leaky integrate-and-fire neurons under noisy inputs balanced near the threshold, we construct a framework that links synaptic type, strength, and spiking nonlinearity with the statistics of neuronal population activity. The framework explains structured pairwise and higher-order interactions of neurons receiving common inputs under different architectures. We compared the theoretical predictions with the activity of monkey and mouse V1 neurons and found that excitatory inputs given to pairs explained the observed sparse activity characterized by strong negative triple-wise interactions, thereby ruling out the alternative explanation by shared inhibition. Moreover, we showed that the strong interactions are a signature of excitatory rather than inhibitory inputs whenever the spontaneous rate is low. We present a guide map of neural interactions that helps researchers specify the hidden neuronal motifs underlying the interactions observed in empirical data.


Subjects
Nerve Net, Neurons, Mice, Animals, Action Potentials/physiology, Neurons/physiology, Nerve Net/physiology, Neurological Models
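
The leaky integrate-and-fire model named in this abstract can be illustrated with a small simulation. What follows is a minimal numerical sketch, not the exact analytical solution the abstract refers to: it integrates dV = (-V/tau + mu) dt + sigma dW with the Euler-Maruyama method. The function name `simulate_lif`, the normalized voltage units, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_lif(mu, sigma, tau=0.02, v_th=1.0, v_reset=0.0,
                 dt=1e-4, t_max=1.0, seed=0):
    """Simulate a leaky integrate-and-fire neuron and return its spike times.

    mu is the mean input drive, sigma the noise amplitude, tau the membrane
    time constant in seconds. Voltage is in arbitrary normalized units.
    """
    rng = random.Random(seed)
    v = v_reset
    spikes = []
    n_steps = int(round(t_max / dt))
    for step in range(n_steps):
        # Euler-Maruyama step: leak + drive, plus Gaussian noise scaled by sqrt(dt).
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        v += (-v / tau + mu) * dt + noise
        if v >= v_th:
            # Threshold crossing: record the spike and reset the membrane.
            spikes.append((step + 1) * dt)
            v = v_reset
    return spikes


# Without noise the membrane settles at mu * tau, so a drive of 40 (0.8 < 1.0)
# never fires, while a drive of 100 (2.0 > 1.0) fires regularly.
quiet = simulate_lif(mu=40.0, sigma=0.0)
regular = simulate_lif(mu=100.0, sigma=0.0)
# With the mean drive just below threshold, firing is driven by fluctuations.
noisy = simulate_lif(mu=47.5, sigma=1.0)
```

The third call places the noise-free steady state (0.95) just under the threshold, which is the kind of near-threshold regime the abstract describes; the resulting spike count depends on the seed.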
2.
Article in English | MEDLINE | ID: mdl-37015449

ABSTRACT

In this study, we improve the existing model for force distribution over the muscles by considering reflex excitation as a nonvoluntary mechanism of our neuromuscular system. The improved model can explain the large difference between biological torque and experimentally optimized assistive torque profiles. Accordingly, we hypothesize that the "nonvoluntary nature of reflexive excitation highly restricts biological torque compensation". The proposed model can also potentially characterize co-activation behavior in antagonistic muscles. Using our improved model, we introduce a well-posed framework to optimize the exoskeleton torque profile by metabolic rate minimization. METHODS: To support our hypothesis and the proposed method, we utilize two experimental datasets for exoskeleton torque optimization: passive and active ankle exoskeletons. First, we use the passive exoskeleton dataset to identify the parameters of our model, i.e., the reflex gains. Then, to validate the proposed model, the identified parameters are used to optimize the exoskeleton torque profile for the second experimental study. LIMITATIONS: It is assumed that joint kinematics and reflex gains are the same with and without the exoskeleton. RESULTS: 74% of biological torque at the ankle joint cannot be experimentally compensated, and the existing models can only explain that 17% of the biological torque is uncompensable. Our improved model can explain that 58% of biological torque is uncompensable (but 16% still remains unexplained). This result supports our hypothesis and shows the undeniable contribution of reflex excitation to exoskeleton torque profile optimization.

3.
Sci Rep ; 11(1): 11846, 2021 06 04.
Article in English | MEDLINE | ID: mdl-34088911

ABSTRACT

Due to the complexity and high degrees of freedom, the detailed assessment of human biomechanics is necessary for the design and optimization of an effective exoskeleton. In this paper, we present a full kinematics, dynamics, and biomechanics assessment of unpowered exoskeleton augmentation for human running gait. The case study considered is the assistive torque profile of I-RUN. Our approach uses extensive data-driven OpenSim simulation results employing a generic lower limb model with 92 muscles and 29 DOF. In the simulation, it is observed that exoskeleton augmentation leads to [Formula: see text] metabolic rate reduction for the stiffness coefficient of [Formula: see text]. Moreover, this optimum stiffness coefficient minimizes the biological hip moment by [Formula: see text]. The optimum stiffness coefficient ([Formula: see text]) also reduces the average force of four major hip muscles, i.e., Psoas, Gluteus Maximus, Rectus Femoris, and Semimembranosus. The effect of the assistive torque profile on the muscles' fatigue is also studied. Interestingly, it is observed that at [Formula: see text], the fatigue of all 92 lower limb muscles and the fatigue of the two major mono-articular hip muscles both show their maximum reduction. This result re-confirms our hypothesis that "reducing the forces of two antagonistic mono-articular muscles is sufficient for involved muscles' total fatigue reduction." Finally, the relation between the amount of metabolic rate reduction and the kinematics of the hip joint is examined carefully; for the first time, we present a reliable kinematic index for predicting the metabolic rate reduction achieved by I-RUN augmentation. This index not only explains individual differences in metabolic rate reduction but also provides a quantitative measure for training the subjects to maximize their benefits from I-RUN.


Subjects
Computer Simulation, Electromyography/methods, Exoskeleton Device, Algorithms, Biomechanical Phenomena, Gait, Hip Joint, Humans, Lower Extremity/physiology, Motor Skills, Skeletal Muscle/physiology, Muscles/metabolism, Neuromuscular Junction, Running, Torque, Walking/physiology
4.
Cogn Psychol ; 118: 101272, 2020 05.
Article in English | MEDLINE | ID: mdl-31972429

ABSTRACT

Heuristics, commonly thought to violate the full rationality assumptions, are paradoxically indispensable parts of our decision-making and learning processes. To resolve this seeming paradox, several studies in the literature have aimed at finding broad daily life conditions and situations in which employing heuristics is rational. However, these studies mainly focus on non-social conditions, whereas, for human beings, social and individual processes are interwoven and it would be better to study them jointly. Here, we study the role of the pruning heuristic in individual reinforcement learning in a social context, where our simulated learning agents make many of their decisions relying on others' knowledge. Our simulation results suggest that the seemingly irrational pruning heuristic leads to lower cost in social settings. That is, we have a meaningfully more social outcome in the presence of this heuristic in social contexts, and social learning helps the agents to learn better where the pruning heuristic is an obstacle to finding the optimal solution in the individual setting. In sum, the synergy between the pruning behavior and social learning leads to ecological rationality.


Subjects
Decision Making, Goals, Heuristics, Psychological Models, Psychological Reinforcement, Humans, Knowledge, Social Environment
5.
IEEE Trans Neural Netw Learn Syst ; 30(6): 1635-1650, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30307878

ABSTRACT

Owing to insufficient generalization in the state space, common methods of reinforcement learning suffer from slow learning speed, especially in the early learning trials. This paper introduces a model-based method in discrete state spaces for increasing the learning speed in terms of required experiences (but not required computation time) by exploiting generalization in the experiences of the subspaces. A subspace is formed by choosing a subset of features in the original state representation. Generalization and faster learning in a subspace are due to the many-to-one mapping of experiences from the state space to each state in the subspace. Nevertheless, due to inherent perceptual aliasing (PA) in the subspaces, the policy suggested by each subspace does not generally converge to the optimal policy. Our approach, called model-based learning with subspaces (MoBLeS), calculates the confidence intervals of the estimated Q-values in the state space and in the subspaces. These confidence intervals are used in the decision-making, such that the agent benefits the most from the possible generalization while avoiding the detriment of the PA in the subspaces. The convergence of MoBLeS to the optimal policy is theoretically investigated. In addition, we show through several experiments that MoBLeS improves the learning speed in the early trials.
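
The many-to-one mapping from the state space to a subspace can be made concrete with a small sketch. This is an illustration of the general idea only, not the authors' algorithm: the helper names `project` and `mean_rewards`, and the representation of a state as a tuple of discrete features, are assumptions.

```python
from collections import defaultdict


def project(state, feature_idx):
    """Map a full state (tuple of features) to a subspace state by keeping
    only the features at the given indices."""
    return tuple(state[i] for i in feature_idx)


def mean_rewards(experiences, feature_idx):
    """Average reward per (projected state, action).

    experiences is a list of (state, action, reward) triples. Returns the
    mean reward and the number of samples behind each estimate.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, action, reward in experiences:
        key = (project(state, feature_idx), action)
        totals[key] += reward
        counts[key] += 1
    means = {key: totals[key] / counts[key] for key in totals}
    return means, dict(counts)


experiences = [((0, 0), "a", 1.0), ((0, 1), "a", 0.0), ((1, 0), "a", 1.0)]
# Full state space: three distinct states, one sample each.
full_means, full_counts = mean_rewards(experiences, (0, 1))
# Subspace on feature 0 only: states (0, 0) and (0, 1) collapse into (0,).
sub_means, sub_counts = mean_rewards(experiences, (0,))
```

In the subspace, the merged state has two samples instead of one, which is the source of faster learning; its estimate of 0.5 blends rewards of 1.0 and 0.0 from states that are actually different, which is the perceptual aliasing the abstract warns about.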

6.
IEEE Trans Neural Syst Rehabil Eng ; 26(10): 2026-2032, 2018 10.
Article in English | MEDLINE | ID: mdl-30281466

ABSTRACT

In this paper, we present a new perspective on designing an unpowered exoskeleton for metabolic rate reduction in running. Our studies on human biomechanics showed that a torsional spring that applies torque as a linear function of the difference between the two hip angles ( -angle), compared with a local spring that applies torque as a function of hip angle ( -angle), provides a better condition for hip moment compensation and, consequently, metabolic rate reduction. Accordingly, a new type of unpowered exoskeleton device realizing this idea was designed, and a prototype of this exoskeleton was constructed. This exoskeleton was tested on 10 healthy active subjects running at 2.5 m/s. In this experiment, an 8.0 ± 1.5% (mean ± s.e.m.) metabolic rate reduction (compared with the no-exoskeleton case) was achieved.


Subjects
Energy Metabolism, Exoskeleton Device, Running/physiology, Adult, Biomechanical Phenomena, Equipment Design, Healthy Volunteers, Hip/anatomy & histology, Hip/physiology, Humans, Male, Orthotic Devices, Oxygen Consumption/physiology, Torque, Walking/physiology, Young Adult
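
The contrast between the two spring arrangements in this abstract can be written down in a few lines. This is a hedged sketch of the stated idea only (torque as a linear function of the inter-hip angle difference versus torque as a function of a single hip angle); the function names, the sign convention, the rest angle, and the linear form of the local spring are assumptions, and no stiffness value from the paper is implied.

```python
def coupled_spring_torque(theta_left, theta_right, k):
    """Torques (left hip, right hip) from one torsional spring acting on the
    difference between the two hip angles. Angles in rad, k in N*m/rad."""
    delta = theta_left - theta_right
    # The spring pulls the two thighs toward each other: equal and opposite torques.
    return (-k * delta, k * delta)


def local_spring_torque(theta, k, theta_rest=0.0):
    """Torque from a spring attached to a single hip, assumed linear here,
    acting on that hip's own angle."""
    return -k * (theta - theta_rest)


# Left hip flexed 0.5 rad, right hip extended 0.25 rad, stiffness 8 N*m/rad.
left, right = coupled_spring_torque(0.5, -0.25, 8.0)
```

With the coupled spring, the torque vanishes whenever both hips have the same angle, regardless of trunk lean, whereas a local spring keeps acting relative to its own rest angle.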
7.
Front Hum Neurosci ; 12: 507, 2018.
Article in English | MEDLINE | ID: mdl-30687039

ABSTRACT

In conflict tasks, like the Simon task, it is usually investigated how task-irrelevant information affects the processing of task-relevant information. In the present experiments, we extended the Simon task to a multimodal setup, in which task-irrelevant information emerged from two sensory modalities. Specifically, in Experiment 1, participants responded to the identity of letters presented at a left, right, or central position with a left- or right-hand response. Additional tactile stimulation occurred on a left, right, or central position on the horizontal body plane. Response congruency of the visual and tactile stimulation was orthogonally varied. In Experiment 2, the tactile stimulation was replaced by auditory stimulation. In both experiments, the visual task-irrelevant information produced congruency effects such that responses were slower and less accurate in incongruent than in congruent conditions. Furthermore, in Experiment 1, such congruency effects, albeit smaller, were also observed for the tactile task-irrelevant stimulation. In Experiment 2, the auditory task-irrelevant stimulation produced the smallest effects. Specifically, the longest reaction times emerged in the neutral condition, while incongruent and congruent conditions differed only numerically. This suggests that in the co-presence of multiple task-irrelevant information sources, location processing is more strongly determined by visual and tactile spatial information than by auditory spatial information. An extended version of the Diffusion Model for Conflict Tasks (DMC) was fitted to the results of both experiments. This Multimodal Diffusion Model for Conflict Tasks (MDMC), and a model variant involving faster processing in the neutral visual condition (FN-MDMC), provided reasonable fits for the observed data. These model fits support the notion that multimodal task-irrelevant information superimposes across sensory modalities and automatically affects the controlled processing of task-relevant information.

8.
J Comput Neurosci ; 44(2): 147-171, 2018 04.
Article in English | MEDLINE | ID: mdl-29192377

ABSTRACT

The noisy threshold regime, where even a small set of presynaptic neurons can significantly affect postsynaptic spike-timing, is suggested as a key requisite for computation in neurons with high variability. It has also been proposed that signals under the noisy conditions are successfully transferred by a few strong synapses and/or by an assembly of nearly synchronous synaptic activities. We analytically investigate the impact of a transient signaling input on a leaky integrate-and-fire postsynaptic neuron that receives background noise near the threshold regime. The signaling input models a single strong synapse or a set of synchronous synapses, while the background noise represents many weak synapses. We find an analytic solution that explains how the first-passage time (ISI) density is changed by transient signaling input. The analysis allows us to connect properties of the signaling input, such as spike timing and amplitude, with the postsynaptic first-passage time density in a noisy environment. Based on the analytic solution, we calculate the Fisher information with respect to the signaling input's amplitude. For a wide range of amplitudes, we observe a non-monotonic behavior of the Fisher information as a function of background noise. Moreover, the Fisher information depends non-trivially on the signaling input's amplitude; changing the amplitude, we observe one maximum at high levels of background noise. The single maximum splits into two maxima in the low noise regime. This finding demonstrates the benefit of the analytic solution in investigating signal transfer by neurons.


Subjects
Action Potentials/physiology, Neurological Models, Neurons/physiology, Signal Transduction/physiology, Synapses/physiology, Animals, Computer Simulation, Reaction Time
9.
Sci Rep ; 7(1): 7052, 2017 08 01.
Article in English | MEDLINE | ID: mdl-28765624

ABSTRACT

Drug addiction has been associated with lack of insight into one's own abilities. However, the scope of metacognition impairment among drug users in general and opiate-dependent individuals in particular is not fully understood. Investigating the impairments of metacognitive ability in Substance Dependent Individuals (SDIs) in different cognitive tasks could contribute to the ongoing debate over whether metacognition has domain-general or domain-specific neural substrates. We compared the metacognitive self-monitoring ability of a group of SDIs during methadone maintenance treatment (n = 23) with that of a control group (n = 24) in a memory task and a visual perceptual task. Post-decision self-judgements of the probability of a correct choice were obtained through trial-by-trial confidence ratings and were used to compute metacognitive ability. Results showed that despite comparable first-order performance in the perceptual task, SDIs had lower perceptual metacognition than the control group. However, although SDIs had poorer memory performance, their metacognitive judgements in the memory task were as accurate as those of the control group. While it is commonly believed that addiction causes pervasive impairment in cognitive functions, including metacognitive ability, we observed that the impairment was only significant in one specific task, the perceptual task, but not in the memory task.


Subjects
Opioid Analgesics/adverse effects, Maintenance Chemotherapy/adverse effects, Memory/drug effects, Mental Disorders/chemically induced, Metacognition/drug effects, Methadone/adverse effects, Perception/drug effects, Opioid Analgesics/administration & dosage, Humans, Maintenance Chemotherapy/methods, Methadone/administration & dosage
10.
Sci Rep ; 7(1): 3167, 2017 06 09.
Article in English | MEDLINE | ID: mdl-28600573

ABSTRACT

Two psychophysical experiments examined multisensory integration of visual-auditory (Experiment 1) and visual-tactile-auditory (Experiment 2) signals. Participants judged the location of these multimodal signals relative to a standard presented at the median plane of the body. A cue conflict was induced by presenting the visual signals with a constant spatial discrepancy to the other modalities. Extending previous studies, the reliability of certain modalities (visual in Experiment 1, visual and tactile in Experiment 2) was varied from trial to trial by presenting signals with either strong or weak location information (e.g., a relatively dense or dispersed dot cloud as visual stimulus). We investigated how participants would adapt to the cue conflict from the contradictory information under these varying reliability conditions and whether participants had insight into their performance. During the course of both experiments, participants switched from an integration strategy to a selection strategy in Experiment 1 and to a calibration strategy in Experiment 2. Simulations of various multisensory perception strategies suggested that optimal causal inference in a varying reliability environment depends not only on the amount of multimodal discrepancy, but also on the relative reliability of stimuli across the reliability conditions.


Subjects
Auditory Perception/physiology, Psychological Conflict, Spatial Processing/physiology, Touch Perception/physiology, Visual Perception/physiology, Acoustic Stimulation, Cues, Humans, Male, Neuropsychological Tests, Photic Stimulation, Young Adult
11.
Sci Rep ; 7(1): 1709, 2017 05 10.
Article in English | MEDLINE | ID: mdl-28490773

ABSTRACT

Neuronal networks of the brain adapt their information processing according to the history of stimuli. Whereas most studies have linked adaptation to repetition suppression, recurrent connections within a network and disinhibition due to adaptation predict more complex response patterns. The main questions of this study are as follows: what is the effect of the selectivity of neurons on suppression/enhancement of neural responses? What are the consequences of adaptation on information representation in neural population and the temporal structure of response patterns? We studied rapid face adaptation using spiking activities of neurons in the inferior-temporal (IT) cortex. Investigating the responses of neurons, within a wide range from negative to positive face selectivity, showed that despite the peak amplitude suppression in highly positive selective neurons, responses were enhanced in most other neurons. This enhancement can be attributed to disinhibition due to adaptation. Delayed and distributed responses were observed for positive selective neurons. Principal component analysis of the IT population responses over time revealed that repetition of face stimuli resulted in temporal decorrelation of the network activity. The contributions of the main and higher neuronal dimensions were changed under an adaptation condition, where more neuronal dimensions were used to encode repeated face stimuli.


Subjects
Physiological Adaptation, Neurons/physiology, Temporal Lobe/physiology, Action Potentials/physiology, Animals, Face, Macaca mulatta, Male, Principal Component Analysis, Signal-to-Noise Ratio, Time Factors
12.
J Neurophysiol ; 116(2): 587-601, 2016 08 01.
Article in English | MEDLINE | ID: mdl-27169503

ABSTRACT

Object categories are recognized at multiple levels of hierarchical abstractions. Psychophysical studies have shown a more rapid perceptual access to the mid-level category information (e.g., human faces) than the higher (superordinate; e.g., animal) or the lower (subordinate; e.g., face identity) level. Mid-level category members share many features, whereas few features are shared among members of different mid-level categories. To understand better the neural basis of expedited access to mid-level category information, we examined neural responses of the inferior temporal (IT) cortex of macaque monkeys viewing a large number of object images. We found an earlier representation of mid-level categories in the IT population and single-unit responses compared with superordinate- and subordinate-level categories. The short-latency representation of mid-level category information shows that visual cortex first divides the category shape space at its sharpest boundaries, defined by high/low within/between-group similarity. This short-latency, mid-level category boundary map may be a prerequisite for representation of other categories at more global and finer scales.


Subjects
Brain Mapping, Neurons/physiology, Nonlinear Dynamics, Visual Pattern Recognition/physiology, Temporal Lobe/cytology, Animals, Computer Simulation, Macaca, Male, Neurological Models, Photic Stimulation, Principal Component Analysis, ROC Curve, Reaction Time, Support Vector Machine, Temporal Lobe/physiology, Time Factors, Visual Pathways/physiology
13.
Comput Intell Neurosci ; 2015: 905421, 2015.
Article in English | MEDLINE | ID: mdl-26185494

ABSTRACT

It has been argued that concepts can be perceived at three main levels of abstraction. Generally, in a recognition system, object categories can be viewed at three levels of taxonomic hierarchy, which are known as the superordinate, basic, and subordinate levels. For instance, "horse" is a member of the subordinate level, which belongs to the basic level of "animal" and the superordinate level of "natural objects." Our purpose in this study is to investigate visual features at each taxonomic level. We first present a recognition tree which is more general in terms of inclusiveness with respect to the visual representation of objects. Then we focus on visual feature definition, that is, how objects from the same conceptual category can be visually represented at each taxonomic level. For the first level, we define global features based on frequency patterns to illustrate visual distinctions between artificial and natural objects. In contrast, our approach for the second level is based on shape descriptors, which are defined using a moment-based representation. Finally, we show how conceptual knowledge can be utilized for visual feature definition in order to enhance recognition of subordinate categories.


Subjects
Algorithms, Classification/methods, Concept Formation, Theoretical Models, Names, Visual Pattern Recognition, Animals, Plants
14.
PLoS One ; 9(7): e103143, 2014.
Article in English | MEDLINE | ID: mdl-25058591

ABSTRACT

In a multisensory task, human adults integrate information from different sensory modalities--behaviorally in an optimal Bayesian fashion--while children mostly rely on a single sensory modality for decision making. The reason behind this change of behavior over age and the process behind learning the required statistics for optimal integration are still unclear and have not been justified by conventional Bayesian modeling. We propose an interactive multisensory learning framework without making any prior assumptions about the sensory models. In this framework, learning in every modality and in their joint space is done in parallel using a single-step reinforcement learning method. A simple statistical test on confidence intervals on the mean of reward distributions is used to select the most informative source of information among the individual modalities and the joint space. Analyses of the method and the simulation results on a multimodal localization task show that the learning system autonomously starts with sensory selection and gradually switches to sensory integration. This is because relying more on modalities--i.e. selection--at early learning steps (childhood) is more rewarding than favoring decisions learned in the joint space, since the smaller state space in the modalities results in faster learning in every individual modality. In contrast, after gaining sufficient experience (adulthood), the quality of learning in the joint space matures while learning in modalities suffers from insufficient accuracy due to perceptual aliasing. This results in a tighter confidence interval for the joint space and consequently causes a smooth shift from selection to integration. It suggests that sensory selection and integration are emergent behaviors and both are outputs of a single reward maximization process; i.e. the transition is not a preprogrammed phenomenon.


Subjects
Aging/psychology, Learning/physiology, Neurological Models, Perception/physiology, Psychological Reinforcement, Reward, Adult, Bayes Theorem, Child, Computational Biology, Decision Making, Humans, Sensory Thresholds/physiology
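
The selection step described in this abstract, a test on confidence intervals of mean reward, can be sketched as follows. The abstract does not specify the test, so the rule used here (pick the source with the highest lower confidence bound, using a normal approximation) is an assumption, as are the names `mean_ci` and `select_source` and the example reward histories.

```python
import math


def mean_ci(rewards, z=1.96):
    """Normal-approximation confidence interval (low, high) for the mean of
    a list of at least two rewards."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / (n - 1)
    half_width = z * math.sqrt(var / n)
    return mean - half_width, mean + half_width


def select_source(reward_history):
    """Return the name of the source whose lower confidence bound on the
    mean reward is highest (an assumed, conservative selection rule)."""
    lower_bounds = {name: mean_ci(r)[0] for name, r in reward_history.items()}
    return max(lower_bounds, key=lower_bounds.get)


# Early in learning: the single modality has more samples and a tighter interval.
early = {"visual": [1, 1, 0, 1, 1, 1, 0, 1], "joint": [1, 0]}
# Later: the joint space has accumulated many high rewards.
late = {"visual": [1, 1, 0, 1, 1, 1, 0, 1], "joint": [1] * 19 + [0]}
choice_early = select_source(early)
choice_late = select_source(late)
```

Under this rule the early history favors the single modality and the late history favors the joint space, which mirrors the selection-to-integration shift the abstract reports, without claiming to reproduce the authors' simulation.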
15.
PLoS One ; 8(12): e81195, 2013.
Article in English | MEDLINE | ID: mdl-24324677

ABSTRACT

Little is known about how people learn to take into account others' opinions in joint decisions. To address this question, we combined computational and empirical approaches. Human dyads made individual and joint visual perceptual decisions and rated their confidence in those decisions (data previously published). We trained a reinforcement (temporal difference) learning agent to receive the participants' confidence levels and learn to arrive at a dyadic decision by finding the policy that either maximized the accuracy of the model decisions or maximally conformed to the empirical dyadic decisions. When confidences were shared visually without verbal interaction, RL agents successfully captured social learning. When participants exchanged confidences visually and interacted verbally, no collective benefit was achieved and the model failed to predict the dyadic behaviour. Behaviourally, dyad members' confidence increased progressively and verbal interaction accelerated this escalation. The success of the model in drawing collective benefit from dyad members was inversely related to the confidence escalation rate. The findings show that an automated learning agent can, in principle, combine individual opinions and achieve collective benefit, but the same agent cannot discount the escalation, suggesting that one cognitive component of collective decision making in humans may involve discounting of overconfidence arising from interactions.


Subjects
Decision Making, Emotions, Learning, Social Behavior, Humans, Male, Theoretical Models, Photic Stimulation, Psychological Reinforcement, Young Adult
16.
Bioinspir Biomim ; 8(3): 036002, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23735558

ABSTRACT

A new control approach to achieve robust hopping against perturbations in the sagittal plane is presented in this paper. In perturbed hopping, vertical body alignment has a significant role in stability. Our approach is based on the recently proposed virtual pendulum concept, which builds on experimental findings in human and animal locomotion. In this concept, the ground reaction forces are pointed to a virtual support point, named the virtual pivot point (VPP), during motion. This concept is employed in designing the controller to balance the trunk during the stance phase. New strategies for leg angle and length adjustment, in addition to the virtual pendulum posture control, are proposed as a unified controller. This method is investigated by applying it to an extension of the spring loaded inverted pendulum (SLIP) model. Trunk, leg mass and damping are added to the SLIP model in order to make the model more realistic. The stability is analyzed by Poincaré map analysis. With a fixed VPP position, stability, disturbance rejection and moderate robustness are achieved, but with a low convergence speed. To improve the performance and attain higher robustness, an event-based control of the VPP position is introduced, using feedback of the system states at apexes. A discrete linear quadratic regulator is used to design the feedback controller. Considerable enhancements with respect to stability, convergence speed and robustness against perturbations and parameter changes are achieved.


Subjects
Biomimetics/methods, Gait/physiology, Leg/physiology, Biological Models, Postural Balance/physiology, Posture/physiology, Robotics/methods, Animals, Computer Simulation, Humans
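
The discrete linear quadratic regulator mentioned in this abstract can be illustrated on the simplest possible case. The sketch below is a scalar toy problem, not the hopping model: it assumes a one-dimensional linear system x[k+1] = a*x[k] + b*u[k] with cost sum(q*x^2 + r*u^2) and finds the feedback gain by iterating the discrete Riccati equation. The function name `dlqr_scalar` and all numbers are illustrative assumptions.

```python
def dlqr_scalar(a, b, q, r, iters=1000, tol=1e-12):
    """Solve the scalar discrete-time LQR problem by fixed-point iteration.

    Returns (k, p): the gain for the control law u = -k * x and the
    converged Riccati solution p (the cost-to-go coefficient).
    """
    p = q
    for _ in range(iters):
        # Scalar discrete algebraic Riccati recursion.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = (a * b * p) / (r + b * b * p)
    return k, p


# Marginally stable plant (a = 1) with unit weights on state and control.
gain, cost = dlqr_scalar(a=1.0, b=1.0, q=1.0, r=1.0)
closed_loop_pole = 1.0 - 1.0 * gain
```

For these unit values the Riccati equation reduces to p^2 - p - 1 = 0, so p converges to the golden ratio and the closed-loop pole lies inside the unit circle. In an event-based setting such as the one in the abstract, x would be the deviation of the apex state from its fixed point, which is an interpretation rather than a detail given in the source.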
17.
Neural Comput ; 23(2): 558-91, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21105824

ABSTRACT

In this letter, we propose a learning system, active decision fusion learning (ADFL), for active fusion of decisions. Each decision maker, referred to as a local decision maker, provides its suggestion in the form of a probability distribution over all possible decisions. The goal of the system is to learn the active sequential selection of the local decision makers to consult, and thus to learn the final decision based on the consultations. These two learning tasks are formulated as learning a single sequential decision-making problem in the form of a Markov decision process (MDP), and a continuous reinforcement learning method is employed to solve it. The states of this MDP are decisions of the attended local decision makers, and the actions are either attending to a local decision maker or declaring final decisions. The learning system is punished for each consultation and wrong final decision and rewarded for correct final decisions. This results in minimizing the consultation and decision-making costs through learning a sequential consultation policy where the most informative local decision makers are consulted and the least informative, misleading, and redundant ones are left unattended. An important property of this policy is that it acts locally. This means that the system handles any nonuniformity in the local decision makers' expertise over the state space. This property has been exploited in the design of local experts. ADFL is tested on a set of classification tasks, where it outperforms two well-known classification methods, AdaBoost and bagging, as well as three benchmark fusion algorithms: OWA, Borda count, and majority voting. In addition, the effect of the local experts' design strategy on the performance of ADFL is studied, and some guidelines for the design of local experts are provided. Moreover, evaluating ADFL in some special cases proves that it is able to derive the maximum benefit from the informative local decision makers and to minimize attending to redundant ones.


Subjects
Decision Support Techniques, Neural Networks (Computer), Humans, Learning
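
Two of the benchmark fusion rules named in this abstract, majority voting and Borda count, are simple enough to sketch directly. This is a textbook-style illustration, not ADFL itself; the function names and the tie-breaking behavior (first label encountered wins) are assumptions.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label proposed by the most decision makers.
    Ties go to the label that appears first in the input."""
    return Counter(labels).most_common(1)[0][0]


def borda_count(rankings):
    """Return the Borda winner.

    Each ranking lists the candidate labels from most to least preferred;
    a label in position i of an n-item ranking earns n - 1 - i points.
    """
    scores = Counter()
    for ranking in rankings:
        n = len(ranking)
        for position, label in enumerate(ranking):
            scores[label] += n - 1 - position
    return scores.most_common(1)[0][0]


# Five local decision makers rank three classes.
rankings = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["b", "c", "a"],
    ["b", "c", "a"],
    ["c", "b", "a"],
]
top_choices = [ranking[0] for ranking in rankings]
by_majority = majority_vote(top_choices)
by_borda = borda_count(rankings)
```

In this example the two rules disagree: the top choices tie between "a" and "b" and the tie-break picks "a", while Borda count picks "b" because it is ranked first or second by everyone. Both rules consult every decision maker, in contrast to the selective consultation policy the abstract describes.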
18.
IEEE Trans Syst Man Cybern B Cybern ; 37(2): 398-409, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17416167

ABSTRACT

Cooperation in learning (CL) can be realized in a multiagent system if agents are capable of learning from both their own experiments and other agents' knowledge and expertise. In CL, the extra resources are exploited to achieve higher efficiency and faster learning than in individual learning (IL). In the real world, however, implementation of CL is not a straightforward task, in part due to possible differences in area of expertise (AOE). In this paper, homogeneous reinforcement-learning agents are considered in an environment with multiple goals or tasks. As a result, they become expert in different domains with different amounts of expertness. Each agent uses a one-step Q-learning algorithm and is capable of exchanging its Q-table with those of its teammates. Two crucial questions are addressed in this paper: "How can the AOE of an agent be extracted?" and "How can agents improve their performance in CL by knowing their AOEs?" An algorithm is developed to extract the AOE based on state transitions as a gold standard from a behavioral point of view. Moreover, it is discussed that the AOE can be implicitly obtained through agents' expertness at the state level. Three new methods for CL through the combination of Q-tables are developed and examined for overall performance after CL. The performances of the developed methods are compared with those of IL, strategy sharing (SS), and weighted SS (WSS). The results show the superior performance of AOE-based methods as compared to existing CL methods, which do not use the notion of AOE. These results are very encouraging in support of the idea that "cooperation based on the AOE" performs better than the general CL methods.


Subjects
Algorithms, Artificial Intelligence, Cooperative Behavior, Decision Support Techniques, Intelligent Systems, Theoretical Models, Automated Pattern Recognition/methods, Computer Simulation
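
The two building blocks this abstract relies on, a one-step Q-learning update and the combination of teammates' Q-tables, can be sketched briefly. The combination shown is a plain weighted average with one weight per agent, in the spirit of weighted strategy sharing; it is not one of the paper's three AOE-based methods, which the abstract does not specify. The dictionary-based Q-table and the names `q_update` and `combine_q_tables` are assumptions.

```python
def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Apply one one-step Q-learning update in place.

    q maps (state, action) pairs to values; missing entries count as 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def combine_q_tables(tables, weights):
    """Weighted average of several agents' Q-tables.

    weights holds one non-negative number per agent, standing in for a
    global expertness measure (an illustrative assumption).
    """
    total = sum(weights)
    keys = set().union(*tables)
    return {
        key: sum(w * t.get(key, 0.0) for t, w in zip(tables, weights)) / total
        for key in keys
    }


# One learning step for a single agent.
q_agent = {}
q_update(q_agent, "s", "a", 1.0, "s2", ["a", "b"])

# Two agents share tables; the first is trusted three times as much.
shared = combine_q_tables([{("s", "a"): 1.0}, {("s", "a"): 0.0}], [3.0, 1.0])
```

A single global weight per agent ignores the fact that an agent may be expert in only part of the state space; weighting at the state level instead is the direction the abstract points to with its notion of area of expertise.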